Align documentation about HTTPRouteTimeouts.BackendRequest timeout scope #3462

Open
wants to merge 3 commits into
base: main

rtribotte
Contributor

What type of PR is this?

/kind documentation

What this PR does / why we need it:

This PR aligns the documentation of the HTTPRouteTimeouts.BackendRequest timeout with the GEP 1742 sequence diagram.
That diagram, which illustrates the timeout boundaries, states that the HTTPRouteTimeouts.BackendRequest timeout runs until the response headers are received:

U->>P: Starts Response
U->>P: Finishes Headers
note right of P: timeouts.backendRequest end time

The CRD documentation described a timeout that runs until the end of the response, which is inconsistent with the diagram. This PR aims to align the two.

Separately, the Traefik Mermaid sequence diagram in GEP 1742 is adjusted to reflect the timeouts Traefik actually supports more accurately.

Which issue(s) this PR fixes:

Does this PR introduce a user-facing change?:

Align documentation about HTTPRouteTimeouts.BackendRequest timeout scope.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/documentation Categorizes issue or PR as related to documentation. labels Nov 20, 2024
@k8s-ci-robot k8s-ci-robot requested a review from candita November 20, 2024 16:06
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 20, 2024
@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/gep PRs related to Gateway Enhancement Proposal(GEP) needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 20, 2024
@k8s-ci-robot
Contributor

Hi @rtribotte. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@youngnick
Contributor

/ok-to-test

This seems reasonable, but I'll need to have a think about it in case it counts as a breaking change.

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 21, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 14, 2025
Member

@robscott robscott left a comment

Thanks @rtribotte! The Traefik related changes to the GEP look great, thanks for the corrections. Want to get more feedback on the changes to the timeout definition.

/hold

@@ -347,7 +347,7 @@ type HTTPRouteTimeouts struct {

 // BackendRequest specifies a timeout for an individual request from the gateway
 // to a backend. This covers the time from when the request first starts being
-// sent from the gateway to when the full response has been received from the backend.
+// sent from the gateway to when the response headers have been received from the backend.
Member

This seems like a significant change, would appreciate some additional reviews on this.

/cc @youngnick @howardjohn @arkodg @kate-osborn @mlavacca

Contributor

IIUC this breaks Envoy implementations?

Contributor

@youngnick youngnick Feb 18, 2025

Yes, in Envoy implementations, there's no timeout for the end of the request, only the end of the headers. So this would not be possible to fulfill in any Envoy-based implementation.

From the update to the Traefik timeouts, I can see that Traefik doesn't support an end-of-headers timeout, which puts us in a tricky position (and is one of the reasons why this GEP took so long to get done).

Edit: I was looking at the request timeouts, not the response ones, sigh.

I think in this case, the godoc is the correct behavior, and we should be changing the diagram to match the godoc.

(In general, for Gateway API, the godoc is the canonical location for specification things, so if there's a mistake, we fix other things first, as the godoc is part of the stable version).

This field is now Standard and so this is definitely a breaking change that we can't do without an API revision.

Sorry @rtribotte this is exactly the sort of thing I was worried about when I said I needed to do a closer read.

Contributor

Isn't it the opposite? Envoy can do "end of response" not "end of response headers"

Contributor

🤦 Looking at the wrong timeout section. Let me edit my comment.

Contributor

Thanks and sorry for the delay here.

Contributor Author

@youngnick I have updated the PR. Nonetheless, I have a question regarding the timeouts.request end time. The godoc says:

Request specifies the maximum duration for a gateway to respond to an HTTP request.

but also

This timeout is intended to cover as close to the whole request-response

According to this, I think the timeouts.request end time is not well placed in the diagram. WDYT?

Contributor

That's why it says "as close as" - from my reading of the timeouts available across implementations, there's not enough commonality to be able to say "this timeout covers exactly from here to here and only those things" - whatever we pick, some implementation can't do it.

Contributor Author

@youngnick Ok, so if it is better to remain vague to match most implementations, could I reflect this tolerance margin on the diagram, as for the timeouts.request start time (min and max)?

It makes me wonder why a similar tolerance margin, on exactly what a timeout covers, was not introduced when the timeouts.backendRequest timeout was defined.
Now that the coverage of timeouts.backendRequest is clarified and aligned in the documentation, it suits the Envoy implementation but not Traefik.

Would you be inclined to introduce a tolerance margin that suits the Traefik implementation?

Contributor

Having a tolerance margin as a field? I need to think about that for a while, sorry.

@k8s-ci-robot
Contributor

@robscott: GitHub didn't allow me to request PR reviews from the following users: kate-osborn.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

This seems like a significant change, would appreciate some additional reviews on this.

/cc @youngnick @howardjohn @arkodg @kate-osborn @mlavacca


@rtribotte rtribotte force-pushed the fix-backendtimeout-doc branch from 14f3878 to 65d5bf4 Compare February 18, 2025 08:33
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 18, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rtribotte
Once this PR has been reviewed and has the lgtm label, please assign thockin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@youngnick
Contributor

One small thing while we're fixing the last sequence diagram, but I think this looks as good as we can do given the multiple-datapath limitations.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 19, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/documentation Categorizes issue or PR as related to documentation. kind/gep PRs related to Gateway Enhancement Proposal(GEP) ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
5 participants